
    Variance Loss in Variational Autoencoders

    In this article, we highlight what appears to be a major issue of Variational Autoencoders, evidenced by extensive experimentation with different network architectures and datasets: the variance of generated data is significantly lower than that of training data. Since generative models are usually evaluated with metrics such as the Frechet Inception Distance (FID), which compare the distributions of (features of) real versus generated images, the variance loss typically results in degraded scores. The problem is particularly relevant in a two-stage setting, where a second VAE is used to sample in the latent space of the first VAE. The reduced variance creates a mismatch between the actual distribution of latent variables and those generated by the second VAE, which hinders the beneficial effects of the second stage. By renormalizing the output of the second VAE towards the expected spherical normal distribution, we obtain a marked improvement in the quality of generated samples, as also attested by the FID.
    Comment: Article accepted at the Sixth International Conference on Machine Learning, Optimization, and Data Science. July 19-23, 2020 - Certosa di Pontignano, Siena, Italy.
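    The renormalization step amounts to standardizing the generated latents so that they match the spherical normal prior. A minimal numpy sketch of this idea follows; the function name and sample shapes are illustrative, not taken from the article.

        import numpy as np

        def renormalize_latents(z: np.ndarray) -> np.ndarray:
            """Standardize latent samples so that each dimension has zero
            mean and unit variance, matching the spherical prior N(0, I).

            z: array of shape (n_samples, latent_dim) sampled from the
            second VAE.
            """
            mean = z.mean(axis=0, keepdims=True)
            std = z.std(axis=0, keepdims=True) + 1e-8  # avoid division by zero
            return (z - mean) / std

        # Illustrative usage: latents generated with too little variance.
        rng = np.random.default_rng(0)
        z = rng.normal(loc=0.1, scale=0.7, size=(10_000, 64))
        print(z.std())                        # ~0.7: variance loss
        print(renormalize_latents(z).std())   # ~1.0: matches the prior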

    Sparsity in Variational Autoencoders

    Working in high-dimensional latent spaces, the internal encoding of data in Variational Autoencoders becomes naturally sparse. We discuss this known but controversial phenomenon, sometimes referred to as overpruning to emphasize the under-use of the model's capacity. In fact, it is an important form of self-regularization, with all the typical benefits associated with sparsity: it forces the model to focus on the really important features, greatly reducing the risk of overfitting. Moreover, it provides a major methodological guide for correctly tuning the model capacity: progressively augmenting it until sparsity appears, or conversely reducing the dimension of the network by removing links to zeroed-out neurons. The degree of sparsity crucially depends on the network architecture: for instance, convolutional networks typically show less sparsity, likely due to the tighter relation of features to different spatial regions of the input.
    Comment: An Extended Abstract of this survey will be presented at the 1st International Conference on Advances in Signal Processing and Artificial Intelligence (ASPAI' 2019), 20-22 March 2019, Barcelona, Spain.
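    Sparsity can be measured directly from the encoder outputs: a latent dimension is inactive when its average KL divergence from the prior is close to zero. A minimal numpy sketch, assuming the usual Gaussian encoder parameterized by mean and log-variance (the threshold value is an illustrative choice):

        import numpy as np

        def kl_per_dimension(mu: np.ndarray, logvar: np.ndarray) -> np.ndarray:
            """Average KL divergence from N(mu, sigma^2) to N(0, 1) for each
            latent dimension, over a batch of encodings of shape
            (n_samples, latent_dim)."""
            kl = 0.5 * (mu**2 + np.exp(logvar) - 1.0 - logvar)
            return kl.mean(axis=0)

        def active_dimensions(mu, logvar, threshold=0.01):
            """Indices of latent dimensions whose KL exceeds a small
            threshold; the remaining ones are effectively zeroed out."""
            return np.where(kl_per_dimension(mu, logvar) > threshold)[0]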

    The Effectiveness of Data Augmentation for Detection of Gastrointestinal Diseases from Endoscopical Images

    The lack, due to privacy concerns, of large public databases of medical pathologies is a well-known and major problem, substantially hindering the application of deep learning techniques in this field. In this article, we investigate the possibility of compensating for the scarcity of data by means of data augmentation techniques, working on the recent Kvasir dataset of endoscopic images of gastrointestinal diseases. The dataset comprises 4,000 color images, labeled and verified by medical endoscopists, covering a few common pathologies at different anatomical landmarks: Z-line, pylorus and cecum. We show how the application of data augmentation techniques allows us to achieve significant improvements in classification over previous approaches, in terms of both precision and recall.
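    As a concrete illustration, a typical augmentation pipeline for this kind of images combines flips, rotations, color jitter and random crops. The following torchvision sketch shows plausible choices; the specific operations and parameters are assumptions for illustration, not those reported in the article.

        from torchvision import transforms

        augment = transforms.Compose([
            transforms.RandomHorizontalFlip(),
            transforms.RandomVerticalFlip(),   # endoscopic views have no canonical orientation
            transforms.RandomRotation(degrees=90),
            transforms.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2),
            transforms.RandomResizedCrop(224, scale=(0.8, 1.0)),
            transforms.ToTensor(),
        ])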

    Smart matching

    One of the most annoying aspects of the formalization of mathematics is the need to transform notions so that they match a given, existing result. These transformations, often based on substantial background knowledge of the given scientific domain (mostly expressed in the form of equalities or isomorphisms), are usually implicit in the mathematical discourse, and it would be highly desirable to obtain a similar behaviour in interactive provers. The paper describes the superposition-based implementation of this feature inside the Matita interactive theorem prover, focusing in particular on the so-called smart application tactic, supporting smart matching between a goal and a given result.
    Comment: To appear in The 9th International Conference on Mathematical Knowledge Management: MKM 2010.
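    The essence of the idea can be conveyed by a toy example, entirely separate from the Matita implementation: given a set of background equalities, search over rewrites of the goal until it syntactically matches the statement of a known result. A minimal Python sketch (string-based rewriting, purely illustrative):

        from collections import deque

        # Background knowledge, as oriented rewrite rules (both directions).
        EQUALITIES = [("n + n", "2 * n"), ("2 * n", "n + n")]

        def smart_match(goal: str, known: str, max_depth: int = 3) -> bool:
            """Breadth-first search over rewrites of `goal`; succeed if some
            rewrite coincides with the known result `known`."""
            queue, seen = deque([(goal, 0)]), {goal}
            while queue:
                term, depth = queue.popleft()
                if term == known:
                    return True
                if depth < max_depth:
                    for lhs, rhs in EQUALITIES:
                        new = term.replace(lhs, rhs)
                        if new not in seen:
                            seen.add(new)
                            queue.append((new, depth + 1))
            return False

        print(smart_match("0 <= n + n", "0 <= 2 * n"))  # True: one rewrite bridges the gap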

    Variational Autoencoders and the Variable Collapse Phenomenon

    In Variational Autoencoders, when working in high-dimensional latent spaces, latent variables of minor significance naturally collapse and are altogether neglected by the generator. We discuss this known but controversial phenomenon, sometimes referred to as overpruning to emphasize the under-use of the model's capacity. In fact, it is an important form of self-regularization, with all the typical benefits associated with sparsity: it forces the model to focus on the really important features, enhancing their disentanglement and reducing the risk of overfitting. In this article, we discuss the issue, surveying past works, and particularly focusing on the exploitation of the variable collapse phenomenon as a methodological guideline for the correct tuning of the model capacity and of the loss function parameters.
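    Used as a tuning guideline, the phenomenon suggests a simple loop: grow the latent dimension until some variables collapse, then shrink the model to its active variables. A hedged sketch of this procedure follows; the helpers `train_vae` and `count_active_dims` are hypothetical, standing for ordinary training and for a KL-based activity count.

        def tune_latent_dim(train_vae, count_active_dims,
                            start_dim: int = 64, max_dim: int = 1024) -> int:
            """Return a latent dimension suggested by the collapse phenomenon."""
            dim = start_dim
            while dim <= max_dim:
                model = train_vae(latent_dim=dim)
                active = count_active_dims(model)
                if active < dim:    # some variables collapsed: capacity is sufficient
                    return active   # shrink the network to its active variables
                dim *= 2            # no collapse yet: the latent space may be too small
            return max_dim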

    Comparing the latent space of generative models

    Different encodings of datapoints in the latent space of latent-vector generative models may result in more or less effective and disentangled characterizations of the different explanatory factors of variation behind the data. Many recent works have been devoted to the exploration of the latent space of specific models, mostly focused on the study of how features are disentangled and of how trajectories producing desired alterations of data in the visible space can be found. In this work we address the more general problem of comparing the latent spaces of different models, looking for transformations between them. We confine the investigation to the familiar and largely investigated case of generative models for the data manifold of human faces. The surprising, preliminary result reported in this article is that (provided the models have not been taught or explicitly conceived to act differently) a simple linear mapping is enough to pass from one latent space to another while preserving most of the information.
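    Such a linear mapping can be estimated by ordinary least squares from paired encodings of the same datapoints in the two spaces. A minimal numpy sketch, assuming both models expose an encoder (names and shapes are illustrative):

        import numpy as np

        def fit_linear_map(z_src: np.ndarray, z_tgt: np.ndarray):
            """Fit an affine map z_tgt ~ z_src @ W + b by least squares.

            z_src: (n, d_src) codes of n datapoints in the first space.
            z_tgt: (n, d_tgt) codes of the same datapoints in the second.
            """
            X = np.hstack([z_src, np.ones((len(z_src), 1))])  # bias column
            coef, *_ = np.linalg.lstsq(X, z_tgt, rcond=None)
            return coef[:-1], coef[-1]  # W, b

        # Usage: fit on one split of paired codes, then judge the quality of
        # the map by how well it predicts target codes on a held-out split.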

    A Web Interface for Matita

    This article describes a prototype implementation of a web interface for the Matita proof assistant. The interface supports all the basic functionalities of the local GTK interface, but takes advantage of the markup to enrich the document with several kinds of annotations or active elements. Annotations may have either a presentational/hypertextual nature, aimed at improving the quality of the proof script as a human-readable document, or a more semantic nature, aimed at helping the system in its processing of the script. The latter kind comprises information automatically generated by the proof assistant during previous compilations, stored in order to improve the performance of re-executing expensive operations like disambiguation or automation.

    Balancing reconstruction error and Kullback-Leibler divergence in Variational Autoencoders

    In the loss function of Variational Autoencoders there is a well-known tension between two components: the reconstruction loss, improving the quality of the resulting images, and the Kullback-Leibler divergence, acting as a regularizer of the latent space. Correctly balancing these two components is a delicate issue, easily resulting in poor generative behaviours. In a recent work, Dai and Wipf obtained a sensible improvement by allowing the network to learn the balancing factor during training, according to a suitable loss function. In this article, we show that learning can be replaced by a simple deterministic computation, helping to understand the underlying mechanism, and resulting in faster and more accurate behaviour. On typical datasets such as CIFAR and CelebA, our technique significantly outperforms all previous VAE architectures.
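    In the spirit of the technique (this sketch is an interpretation for illustration, not the article's code): divide the reconstruction term by a balancing factor gamma and, instead of learning gamma, set it deterministically to a running estimate of the current reconstruction error.

        import numpy as np

        gamma, momentum = 1.0, 0.99  # illustrative initial value and smoothing

        def balanced_vae_loss(x, x_rec, mu, logvar):
            """Reconstruction/KL objective with a deterministic balancing factor."""
            global gamma
            mse = np.mean((x - x_rec) ** 2)                  # reconstruction error
            gamma = momentum * gamma + (1 - momentum) * mse  # deterministic update
            kl = 0.5 * np.mean(np.sum(mu**2 + np.exp(logvar) - 1.0 - logvar, axis=1))
            return mse / gamma + kl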